AITopics | linear program

statements and

Neural Information Processing SystemsApr-29-2026, 14:25:40 GMT

Let a two-player Markov game where both players affect the transition. We will effectively show that the problem of best-responding to a correlated policy σ is526 equivalent to best-responding to the marginal policy of σ for the opponent. The proof follows from527 the equivalence of the two MDPs.528 Before that, given a (possibly correlated) joint policy σ we define a nonlinear program, (PBR), whose539 optimal solutions are best-response policies of each agent k to σ k and the values for each state s540 and timestep h:541 A.2 Proof of Theorem 3.2542 The best-response program. First, we state the following lemma that will prove useful for several543 of our arguments,544 Lemma A.1 (Best-response LP).

artificial intelligence, global minimum, value function, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

a922b7121007768f78f770c404415375-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-27-2026, 01:20:29 GMT

abstraction, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.15)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)

Add feedback

Equilibrium Refinement for the Age of Machines: The One-Sided Quasi-Perfect Equilibrium

Neural Information Processing SystemsApr-25-2026, 18:22:37 GMT

In two-player zero-sum extensive-form games, Nash equilibrium prescribes optimal strategies against perfectly rational opponents. However, it does not guarantee rational play in parts of the game tree that can only be reached by the players making mistakes. This can be problematic when operationalizing equilibria in the real world among imperfect players. Trembling-hand refinements are a sound remedy to this issue, and are subsets of Nash equilibria that are designed to handle the possibility that any of the players may make mistakes. In this paper, we initiate the study of equilibrium refinements for settings where one of the players is perfectly rational (the "machine") and the other may make mistakes.

artificial intelligence, equilibrium, game theory, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Games > Poker (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Add feedback

Equilibrium Refinement for the Age of Machines: The One-Sided Quasi-Perfect Equilibrium

Neural Information Processing SystemsApr-25-2026, 18:22:33 GMT

In two-player zero-sum extensive-form games, Nash equilibrium prescribes optimal strategies against perfectly rational opponents. However, it does not guarantee rational play in parts of the game tree that can only be reached by the players making mistakes. This can be problematic when operationalizing equilibria in the real world among imperfect players. Trembling-hand refinements are a sound remedy to this issue, and are subsets of Nash equilibria that are designed to handle the possibility that any of the players may make mistakes. In this paper, we initiate the study of equilibrium refinements for settings where one of the players is perfectly rational (the "machine") and the other may make mistakes.

artificial intelligence, equilibrium, game theory, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Games > Poker (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Add feedback

41bacf567aefc61b3076c74d8925128f-Supplemental.pdf

Neural Information Processing SystemsApr-25-2026, 14:56:28 GMT

artificial intelligence, eigenvector, vertex, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)

Add feedback

Appendix: On the Expressivity of Markov Reward

Neural Information Processing SystemsApr-25-2026, 14:37:29 GMT

We first address questions that might arise in response to the main text. That is, if Alice chooses a SOAP, PO, or TO for Bob to learn to solve, when can Alice determine Bob has solved the task? A: Bob can be said to be doing better on a given task if his behavior improves, as is typical in evaluating behavior under reward. The difference with SOAPs, POs, and TOs is that we measure improvement relative to the task rather than reward. For instance, given a SOAP, we might say that Bob has solved the task once he has found one of the good policies, and we might measure Bob's progress on a task in terms of the distance of his greedy policy to one of the good policies (as done in our learning experiments). The same reasoning applies to POs and TOs: Bob is doing better on a task in so far as his greedy policy (or trajectories) is (are) higher up the ordering. That is, the studied reward functions must be a function of s, (s,a), or (s,a,s0). A: Indeed, as discussed in our introduction, our goal is to examine the expressivity of Markov rewards in the context of finite MDPs.

artificial intelligence, machine learning, reward function, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.96)

Add feedback

2022DOPEsuppl

Archana Bura

Neural Information Processing SystemsApr-24-2026, 09:51:06 GMT

algorithm, artificial intelligence, inequality, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)

Add feedback

Fundamental Limits of Prompt Compression: A Rate-Distortion Framework for Black-Box Language Models

Neural Information Processing SystemsMar-22-2026, 01:03:13 GMT

We formalize the problem of prompt compression for large language models (LLMs) and present a framework to unify token-level prompt compression methods which create hard prompts for black-box models. We derive the distortion-rate function for this setup as a linear program, and provide an efficient algorithm to compute this fundamental limit via the dual of the linear program. Using the distortion-rate function as the baseline, we study the performance of existing compression schemes on a synthetic dataset consisting of prompts generated from a Markov chain, natural language queries, and their respective answers. Our empirical analysis demonstrates the criticality of query-aware prompt compression, where the compressor has knowledge of the downstream task/query for the black-box LLM. We show that there is a large gap between the performance of current prompt compression methods and the optimal strategy, and propose Adaptive QuerySelect, a query-aware, variable-rate adaptation of a prior work to close the gap. We extend our experiments to a small natural language dataset to further confirm our findings on our synthetic dataset.

artificial intelligence, large language model, natural language, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.83)

Add feedback